Abstract

A knowledge system S describing a part of the real world does not, in general, contain complete information. Reasoning with incomplete information is prone to errors, since any belief derived from S may be false in the present state of the world. A false belief may suggest wrong decisions and lead to harmful actions. So an important goal is to make false beliefs as unlikely as possible. This work introduces the notions of typical atoms and typical models, and shows that reasoning with typical models minimizes the expected number of false beliefs over all ways of using incomplete information. Various properties of typical models are studied, in particular, correctness and stability of beliefs suggested by typical models, and their connection to oblivious reasoning.

Keywords: Incomplete information, reasoning errors, false beliefs, typical 
models, evidence, oblivious reasoning, counting models. 
1 Introduction

Let us consider a knowledge system S describing a part of the real world. The knowledge contained in the system consists of data describing properties of various objects of the world, their mutual relationships, and laws governing their behavior and evolution. For example, consider a system S of medical knowledge about the "world" of a hospital. S contains descriptions of diseases (their causes, development, consequences, examination, symptoms, treatment, prevention), information about various medicaments (their composition, therapeutic activity, dosage, directions for use, interactions, side effects), a description of the hospital (its structure, management, location), personal data of the hospital patients (their medical history, test results), general rules of medicine, etc. An important decision that has to be made by a physician for his or her patient is determining the right diagnosis and the best treatment. The physician may wish to consult the vast amount of knowledge collected in the system. Will the system help the physician to make the right decision? This depends to a large extent on the way the knowledge is used for deriving conclusions.

Let us define important features of S and their correspondence to the world 
it describes. 

S is presented in a first order language. S is consistent, having a set $MOD(S)$ of models. Each model of S is a set of ground atomic formulas expressed in terms of values assigned to various objects and parameters of the world (such as names of patients, quantities of medicaments, etc.).

The multitude of models of S reflects uncertainty regarding actual values of some of these parameters. For instance, as long as neither a final diagnosis nor a treatment of a patient A is determined, there are, say, two possible diagnoses: $D_1$ (with possible treatments $T_1$ or $T_2$) or $D_2$ (with $T_3$ or $T_4$). So $MOD(S)$ may contain four different models, each including $D_1 T_1$ or $D_1 T_2$ or $D_2 T_3$ or $D_2 T_4$.

A set of values of all parameters of the world (including those not presented in S) determines its state. With regard to patient A the hospital world has at least four possible states, each represented in S by one of the four models mentioned above.

Since the available knowledge of the real world is incomplete in general, it may happen that a model $\mu$ of S represents several states of the world that differ in reality but are indistinguishable from the point of view of the information presented in S. For S the states represented by $\mu$ constitute an equivalence class of possible states of the world called a possible world (denoted w in the sequel).

S describes the world faithfully in the sense that every possible state of the world belongs to a possible world represented by a model of S, and every model of S represents a non-empty set of possible states of the world. From the point of view of S the real world appears as a set W of possible worlds such that there is a bijection between $MOD(S)$ and W.

A user of S having a query whether a formula F is true in the present state of the world applies to S and expects to get an answer based on the information stored in S. If F (or $\neg F$) is a logical consequence of S then S contains complete information about F. In this case F is true (false) in all models of S and all possible worlds. Hence F is certainly true (false) in the present state. However, if neither F nor $\neg F$ follows from S then the information in S regarding F is incomplete and does not facilitate derivation of a definite answer to the query.
But this does not diminish the need or importance of a reasonable answer. A 
physician cannot delay for a long time a treatment of a patient just because 
he or she is not yet certain about the final diagnosis. Travelers reaching a 
crossroads would not just stay there even if they are not sure which way leads 
to their destination. 

Mark Twain wrote, "The trouble with the world is not that people know 
too little, but that they know so many things that ain't so." Even given an 
extensive knowledge of the world, its incompleteness makes erroneous judgment 
inevitable. In the present state of the world a formula F has a certain value 
although this value may be uncertain from the standpoint of S. If the informa- 
tion about F contained in S is incomplete (briefly, F is incomplete in S), we 
have to find a way of reasoning producing a belief regarding F which is credible in the sense that it stands a good chance of being true in reality. So a system of automated reasoning must be able to answer the following query:



Given a formula F and a system S describing faithfully a world, 
what is a most credible belief regarding the truth of F in the present 
state of the world? 

In order to answer various multiple queries consistently a reasoner has to choose one particular model $\mu$ of S (a preferred model) and then believe that F is true in reality iff $\mu \models F$. If $S \models F$ or $S \models \neg F$ then the choice of $\mu$ does not matter; however, if F is incomplete in S then F is true in some models, but false in the others. Which is the correct value of F in the present state of the world? With any choice of $\mu$ there is a non-zero probability that the belief in F implied by $\mu$ is false in the present state of the world. So reasoning with incomplete information is prone to errors.

The way of choosing the preferred model provides a semantics for the process of reasoning. Whatever this way is, errors are inevitable since the preferred model may not fully conform to the present state of the world. The smaller the expected number (or the severity) of errors, the more reliable the semantics. Numerous approaches to reasoning with incomplete information have been developed, including Nonmonotonic Logics (Antoniou, 1997; Brewka et al., 2007; Shoham, 1987) and methods based on Semantics of Minimal Models (Bidoit et al., 1986; Gelfond et al., 1988; McCarthy, 1980; Minker, 1982; Van Gelder et al.).

None of the previous work considered minimization of the risk that beliefs sanctioned by the proposed semantics are false in the real world. A false belief may suggest wrong decisions and lead to harmful actions. As reasoning errors caused by incompleteness of information are inevitable, minimization of the number and likelihood of false beliefs becomes a practically important goal.

The following sections introduce the semantics of typical models and show 
that it minimizes the expected number of erroneous beliefs over all ways of 
reasoning with incomplete information. 



2 Evidence 

At any moment the world is in exactly one of its possible states, and so in exactly one of the possible worlds represented by the corresponding model of S. Let $p(w)$ denote the probability that at a randomly chosen moment the world is in a state belonging to a possible world $w \in W$ represented by a model $\mu$ of S. Then to every $\mu \in MOD(S)$ representing the corresponding $w \in W$ one may assign a probability $p(\mu) = p(w)$ such that $\sum_{\mu \in MOD(S)} p(\mu) = \sum_{w \in W} p(w) = 1$. So the probability $p(F)$ that a formula F is true in the present state of the world is

$$p(F) = \sum_{\mu \in MOD(S \cup \{F\})} p(\mu), \tag{1}$$

where $MOD(S \cup \{F\}) = \{\mu \mid \mu \in MOD(S) \wedge \mu \models F\}$ is the set of models of S implying F.

If $p(F) > 0.5$ then it is reasonable to believe that F is more likely to be true than false in the present state of the world, and the larger $p(F)$, the more credible this belief.

The problem, however, is that in most practical cases there is no reliable information regarding the distribution $p(w)$. In the absence of this information let us assume, just for the moment, that all possible worlds are equiprobable, and that the sets W and $MOD(S)$ are finite. Appendix B shows a way to relax these limitations when certain knowledge is available about the probability of possible worlds and the structure of a given system and its domain.

The assumptions of the previous paragraph lead to the following approach.

Definition 2.1 (Principle of majority of models, PMM) Believe that a formula F is more likely to be true than false in a state of the world if F is true in a majority of models of S. The larger the majority, the more credible the belief. □

A reasonable semantics should respect the power of majority; indeed, F is true (false) in S if it is true (false) in all models of S. Obeying such unanimity, would it be reasonable to disregard a majority of 99.9% or even 80%?

As PMM suggests a belief regarding the truth value of F, we may say that the set of models of S offers an evidence of F, $E(S,F)$. We would like the evidence to provide a quantitative measure of credibility of the corresponding belief. To normalize the value of evidence for all S and F such that $0 \le E(S,F) \le 1$, it is reasonable to require that $E(S,F) = 1$ for $S \models F$, $E(S,F) = 0$ for $S \models \neg F$, and $E(S,F) + E(S,\neg F) = 1$ for all S, F. More requirements are presented by Lozinskii, leading to the following definition.

Definition 2.2 Evidence of F in S:

$$E(S,F) = \frac{|MOD(S \cup \{F\})|}{|MOD(S)|}. \tag{2}$$ □

PMM suggests that F is true if $E(S,F) > 0.5$ or $E(S,F) = 0.5$ (the latter is chosen to avoid ambiguity; see also footnote 3). Given a query regarding the truth value of F, a reasoner may not only return 'true' or 'false', but also attach the value of $E(S,F)$ to the answer to give a measure of credibility of the latter. In cases where accepting an erroneous answer can have very undesirable consequences, a query can require a certain level of credibility, for example, ignoring answers with evidence less than 0.9.
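For a small propositional system, Definition 2.2 can be evaluated directly by enumerating interpretations. The following is a minimal sketch under that assumption; it is not the counting method used in this work (see Section 6). The helper names `holds`, `models` and `evidence`, and the encoding of a literal as a string with `"~"` for negation, are illustrative choices. The system S is the one of Example 3.1.

```python
from fractions import Fraction
from itertools import product


def holds(literal, model):
    """True if the literal ('x' or '~x') is satisfied by the model (dict atom -> bool)."""
    atom = literal.lstrip("~")
    return model[atom] != literal.startswith("~")


def models(clauses, atoms):
    """Brute-force MOD(S): all total assignments satisfying every clause."""
    found = []
    for values in product([True, False], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(holds(lit, m) for lit in clause) for clause in clauses):
            found.append(m)
    return found


def evidence(clauses, atoms, formula):
    """E(S,F) = |MOD(S u {F})| / |MOD(S)|; `formula` is a predicate on a model.

    Assumes S is consistent (otherwise the denominator is zero).
    """
    mods = models(clauses, atoms)
    return Fraction(sum(1 for m in mods if formula(m)), len(mods))


# S = {a v b, b v c, c v a, ~a v ~b v ~c}
S = [{"a", "b"}, {"b", "c"}, {"c", "a"}, {"~a", "~b", "~c"}]
ATOMS = ["a", "b", "c"]

e_a = evidence(S, ATOMS, lambda m: m["a"])
e_ab = evidence(S, ATOMS, lambda m: m["a"] and m["b"])
believe_a = e_a >= Fraction(1, 2)  # PMM: believe F when E(S,F) >= 0.5
```

Exact fractions are used so that the borderline case $E(S,F) = 0.5$ is not blurred by floating-point rounding.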



3 Oblivious vs. non-oblivious reasoning 

In the absence of sufficient statistical information the evidence $E(S,F)$ is regarded as an approximation of the probability $p(F)$ that F is true in a randomly chosen possible world, that is, the probability that the belief in the truth of F is correct in the present state of W.

Consider a reasoner R that forms beliefs in order to answer a series of queries $F_1, \ldots, F_k$. Denote by $R(F_i)$ the reasoner's belief regarding the truth of $F_i$. If the reasoner computes $R(F_i)$ as the answer to $F_i$ without taking into account the previous beliefs $R(F_j)$ ($1 \le j < i$) preceding $R(F_i)$, let us call this way of reasoning oblivious. Then if it turns out that there is no model of S in which all of $R(F_1), \ldots, R(F_k)$ are true, then the beliefs of R are inconsistent with S, which is unacceptable.

Oblivious reasoning with incomplete information may lead to inconsistency. Indeed, let $M_{R(F_i)}$ denote a set of all models of S in which $R(F_i)$ is true. Then the set of beliefs $\{R(F_1), \ldots, R(F_k)\}$ is consistent with S (and so, holds in some state of W) iff

$$\bigcap_{i=1}^{k} M_{R(F_i)} \neq \emptyset. \tag{3}$$

For all queries F incomplete in S, $M_{R(F)}$ is a proper subset of $MOD(S)$, so the size of their intersection (3) is a monotone decreasing function of k such that for a large k condition (3) may not hold (see footnote 2). This does not happen if the reasoning is non-oblivious such that in derivation of $R(F_i)$ all previously produced beliefs are taken into consideration. One way of doing so is to derive $R(F_i)$ from $S \cup \{R(F_1), \ldots, R(F_{i-1})\}$. In this case, however, the value of each belief depends on the order of queries in their sequence.

Example 3.1 $S = \{a \vee b,\ b \vee c,\ c \vee a,\ \neg a \vee \neg b \vee \neg c\}$;
$MOD(S) = \{\{a, b, \neg c\}, \{a, \neg b, c\}, \{\neg a, b, c\}\}$.

Queries: $F_1 = a$, $F_2 = b$, $F_3 = c$; $k = 3$; $E(S,a) = E(S,b) = E(S,c) = 2/3$.
Obliviously: $R(a) = R(b) = R(c) =$ 'true', which is inconsistent with S.
Non-obliviously: let $S_0 = S$, $S_i = S_{i-1} \cup \{R(F_i)\}$ for $1 \le i \le k$. Then
$E(S_0, a) = 2/3$; $R(a) =$ 'true'; $S_1 = \{a,\ b \vee c,\ \neg b \vee \neg c\}$;
$E(S_1, b) = 1/2$; $R(b) =$ 'true'; $S_2 = \{a, b, \neg c\}$;
$E(S_2, c) = 0$; $R(c) =$ 'false'. All these beliefs hold in the first model of S. □
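The two ways of answering the queries of Example 3.1 can be sketched as follows. This is an illustrative brute-force version that assumes atomic queries and a small propositional S; the function names and the `"~"` encoding of negation are not from the source.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def models(clauses, atoms):
    """All total assignments (dict atom -> bool) satisfying every clause."""
    found = []
    for values in product([True, False], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(m[l.lstrip("~")] != l.startswith("~") for l in c) for c in clauses):
            found.append(m)
    return found


def evidence(clauses, atoms, atom):
    """E(S, atom) as the fraction of models of S in which the atom is true."""
    mods = models(clauses, atoms)
    return Fraction(sum(m[atom] for m in mods), len(mods))


def oblivious(S, atoms, queries):
    """Answer every query from S alone, ignoring earlier beliefs."""
    return {q: evidence(S, atoms, q) >= HALF for q in queries}


def non_oblivious(S, atoms, queries):
    """Answer query i from S_{i-1}, then add the belief as a unit clause."""
    current, beliefs = list(S), {}
    for q in queries:
        beliefs[q] = evidence(current, atoms, q) >= HALF
        current.append({q if beliefs[q] else "~" + q})
    return beliefs


def consistent(S, atoms, beliefs):
    """True if some model of S satisfies all beliefs simultaneously."""
    units = [{q if value else "~" + q} for q, value in beliefs.items()]
    return len(models(list(S) + units, atoms)) > 0


S = [{"a", "b"}, {"b", "c"}, {"c", "a"}, {"~a", "~b", "~c"}]
ATOMS = ["a", "b", "c"]
obl = oblivious(S, ATOMS, ["a", "b", "c"])
non = non_oblivious(S, ATOMS, ["a", "b", "c"])
```

Changing the order of the queries passed to `non_oblivious` changes which atom ends up believed false, which illustrates the order dependence noted above.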

Non-oblivious reasoning requires keeping track of many previously produced 
beliefs, so in general it is more time-consuming than its oblivious counterpart. 
Thus it would be helpful to determine sets of queries that can be answered 
obliviously in any order without any risk of inconsistency. A trivial example is 
a set of all formulas F such that S \= F. Subsection 4.3 presents less obvious 
sets allowing oblivious reasoning. 

4 Semantics of typical models 

This section introduces the basic notions of typical atoms and typical models, 
and studies stability of the corresponding beliefs. 

^2 For instance, in Example 3.1 expression (3) holds for k = 2, but not for k = 3.

4.1 Typical atoms

S is supposed to be formulated in a first order language, so using the terminology of Predicate Calculus let the base of S be a set of all ground instances of all atomic formulas corresponding to all predicates occurring in S:

$$Base(S) = \{P^{(k)}(t_1, \ldots, t_k)\}, \tag{4}$$

such that $P^{(k)}(x_1, \ldots, x_k)$ occurs in S, $D_i$ is the domain of $x_i$, and $t_i \in D_i$ for $1 \le i \le k$.

Definition 4.1 For each ground atomic formula $a \in Base(S)$ let $\hat{a}$ denote the typical atom corresponding to a such that the evidence of $\hat{a}$ is at least as large as that of $\neg\hat{a}$:

$$\hat{a} = \begin{cases} a & \text{if } E(S,a) \ge 0.5, \\ \neg a & \text{otherwise.} \end{cases} \tag{5}$$

For a formula F we define its typical value $\hat{F}$ by substituting F for a in expression (5).

Evidence $E(S,a)$ is introduced in order to be used as an approximation of $p(a)$. The larger the difference $|E(S,a) - 0.5|$, the better this approximation. However, if $E(S,a)$ is close to 0.5, it may diverge from $p(a)$ even qualitatively, such that $E(S,a) > 0.5$ but $p(a) < 0.5$. But in the absence of sufficient statistics regarding $p(w)$ one has to rely on $E(S,a)$ and believe that any typical atom is not less likely to be true than false in the present state of the world. On the other hand, the need to avoid inconsistency may force a non-oblivious reasoner to adopt beliefs in negation of some typical atoms. So questions arising in any reasoning system intended for answering multiple queries are:

Given a set of queries $\{F_1, \ldots, F_k\}$, is there a state of the world in which beliefs in the truth of typical values $\{\hat{F}_1, \ldots, \hat{F}_k\}$ hold for all $1 \le i \le k$, i.e. is there a model m of S such that $m \models \bigwedge_{i=1}^{k} \hat{F}_i$? What is the value of evidence $E(S, \bigwedge_{i=1}^{k} \hat{F}_i)$?

The answer to the first question is positive if the latter evidence is larger than zero.

Let $A^{(k)}$ be a set of k literals l such that $l \in \{a, \neg a\}$, $a \in Base(S)$, and $A^{(k)}_{\wedge} = \bigwedge_{l \in A^{(k)}} l$. The following theorem estimates the value of evidence of $A^{(k)}_{\wedge}$.

Theorem 4.1 $\max(0, \alpha, \beta) \le E(S, A^{(k)}_{\wedge}) \le \min(1, \gamma)$, where

$$\alpha = \sum_{l \in A^{(k)}} E(S,l) - k + 1, \tag{6}$$

$$\beta = 1 - \frac{2^{|Base(S)|-k}(2^k - 1)}{|MOD(S)|}, \qquad \gamma = \frac{2^{|Base(S)|-k}}{|MOD(S)|}. \tag{7}$$



^3 If $E(S,a) = 0.5$ then $E(S,a) = E(S,\neg a)$, and so any one of a or $\neg a$ can be considered a typical atom. However, in practice a proper choice of one of a or $\neg a$ should be made based on relevant knowledge of the real world.
Proof. First, we prove by induction on k that

$$|MOD(S \cup A^{(k)})| \ge \sum_{l \in A^{(k)}} |MOD(S \cup \{l\})| - (k-1)|MOD(S)|. \tag{8}$$

Base. For k = 1 inequality (8) holds trivially.
Step. Let $M_1, M_2$ be subsets of $MOD(S)$, then

$$|M_1 \cap M_2| \ge |M_1| + |M_2| - |MOD(S)|.$$

If inequality (8) holds for all $1 \le i \le j$, then it holds for $j + 1$. Indeed, let $A^{(j+1)} = A^{(j)} \cup \{l'\}$, then

$$|MOD(S \cup A^{(j+1)})| = |MOD(S \cup A^{(j)}) \cap MOD(S \cup \{l'\})| \ge$$
$$\ge \sum_{l \in A^{(j)}} |MOD(S \cup \{l\})| - (j-1)|MOD(S)| + |MOD(S \cup \{l'\})| - |MOD(S)| =$$
$$= \sum_{l \in A^{(j+1)}} |MOD(S \cup \{l\})| - j|MOD(S)|.$$

Further, by (8) and since evidence is a non-negative value,

$$E(S, A^{(k)}_{\wedge}) \ge \max\Bigl(0,\ \sum_{l \in A^{(k)}} E(S,l) - k + 1\Bigr). \tag{9}$$

Next, $A^{(k)}_{\wedge}$ is true in $2^{|Base(S)|-k}$ interpretations of S but false in the rest of them. So

$$1 - \frac{2^{|Base(S)|}(1 - 2^{-k})}{|MOD(S)|} \le E(S, A^{(k)}_{\wedge}) \le \min\Bigl(1,\ \frac{2^{|Base(S)|-k}}{|MOD(S)|}\Bigr). \tag{10}$$

Expressions (9) and (10) complete the proof. □
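As a numerical illustration of the bounds of Theorem 4.1, the sketch below evaluates $\alpha$, $\beta$, $\gamma$ and the actual evidence for the system of Example 3.1 and the literal set {a, b}. It assumes a propositional S small enough for exhaustive enumeration; the function names and the `"~"` literal encoding are illustrative only.

```python
from fractions import Fraction
from itertools import product


def holds(literal, model):
    """True if the literal ('x' or '~x') is satisfied by the model."""
    return model[literal.lstrip("~")] != literal.startswith("~")


def models(clauses, atoms):
    """Brute-force enumeration of MOD(S)."""
    found = []
    for values in product([True, False], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(holds(l, m) for l in c) for c in clauses):
            found.append(m)
    return found


def theorem_4_1(clauses, atoms, literals):
    """Return (lower bound, actual evidence, upper bound) for the conjunction of literals."""
    mods = models(clauses, atoms)
    k, base, m_count = len(literals), len(atoms), len(mods)

    def ev(lit):
        return Fraction(sum(holds(lit, m) for m in mods), m_count)

    alpha = sum(ev(l) for l in literals) - k + 1                    # expression (6)
    beta = 1 - Fraction(2 ** (base - k) * (2 ** k - 1), m_count)    # expression (7)
    gamma = Fraction(2 ** (base - k), m_count)                      # expression (7)
    actual = Fraction(sum(all(holds(l, m) for l in literals) for m in mods), m_count)
    return max(0, alpha, beta), actual, min(1, gamma)


S = [{"a", "b"}, {"b", "c"}, {"c", "a"}, {"~a", "~b", "~c"}]
lower, actual, upper = theorem_4_1(S, ["a", "b", "c"], ["a", "b"])
```

In this instance $\alpha = 1/3$, $\beta = -1$ and $\gamma = 2/3$, so the lower bound is attained by the actual evidence 1/3.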

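Let $\mathcal{F}^{(k)}$ be a set of k formulas, and $\mathcal{F}^{(k)}_{\wedge} = \bigwedge_{F \in \mathcal{F}^{(k)}} F$. If $A^{(k)}$ is replaced with $\mathcal{F}^{(k)}$ then Theorem 4.1 implies the following

Corollary 4.1 (i) $E(S, \mathcal{F}^{(k)}_{\wedge}) \ge \sum_{F \in \mathcal{F}^{(k)}} E(S,F) - k + 1$;

(ii) For all formulas $\phi$, $\psi$, if $E(S,\phi) > 0.5$ and $E(S,\psi) \ge 0.5$ then $\phi \wedge \psi$ is consistent with S;

(iii) If $E(S,F) = 0.5$ call F a neutral formula. If there are two neutral formulas in a set $\mathcal{F}^{(k)}$ then $\mathcal{F}^{(k)}_{\wedge}$ may be inconsistent with S. □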

4.2 Typical models 

Let $T(S)$ denote the set of all typical atoms of S, and $T(m)$ be the set of all typical atoms contained in a model m:

$$T(S) = \{\hat{a} \mid a \in Base(S)\}, \qquad T(m) = \{\hat{a} \mid \hat{a} \in m\} = T(S) \cap m. \tag{11}$$

Definition 4.2 If there exists a model $\mu$ of S such that $T(\mu) = T(S)$ then $\mu$ is the most typical model of S. For all $m \in MOD(S)$, if there is no model $m'$ of S such that $T(m) \subset T(m')$ then m is a typical model of S. □

A system S may have no most typical model, but every S has a typical one. Indeed, every typical atom $\hat{a}$ is consistent with S, so there is a model m containing $\hat{a}$. Either m is a typical model of S or there is a typical model $\mu$ such that $T(m) \subset T(\mu)$.

Suppose a reasoner R prefers a model m, assuming that it describes most faithfully the present possible world w. Then m represents the set of R's beliefs, but because of incompleteness of S some of the beliefs may be false in w.

Definition 4.3 A formula F is false in a possible world w with probability $1 - p(F)$ (expression (1)). Let the erratum $ER(A)$ of a set of literals A be the expected proportion of its literals that are false in a randomly chosen possible world w. Then taking $E(S,l)$ as an approximation of $p(l)$ we get

$$ER(A) = 1 - \frac{1}{|A|} \sum_{l \in A} E(S,l). \tag{12}$$

Theorem 4.2 (i) For all $m \in MOD(S)$ there is a typical model $\mu$ of S such that $ER(\mu) \le ER(m)$.

(ii) If $\mu$ is the most typical model of S then for all $m \in MOD(S)$, $ER(\mu) \le ER(m)$.

Proof. (i) If m is a typical model of S then (i) holds trivially, else there is a typical model $\mu$ such that $T(m) \subset T(\mu)$. Denote $\delta_1 = T(\mu) - T(m) = \mu - m$, $\delta_2 = m - \mu$, $B = |Base(S)|$. All literals of $\delta_1$ are typical atoms. There is a bijection between $\delta_1$ and $\delta_2$ such that to every literal $\hat{a} \in \delta_1$ corresponds $\neg\hat{a} \in \delta_2$. Since for all $a \in Base(S)$, $E(S,\hat{a}) \ge E(S,\neg\hat{a})$,

$$ER(\mu) - ER(m) = \frac{1}{B} \sum_{\hat{a} \in \delta_1} \bigl(E(S,\neg\hat{a}) - E(S,\hat{a})\bigr) \le 0. \tag{13}$$

(ii) If $\mu$ is the most typical model of S then for all $m \in MOD(S)$, $T(m) \subseteq T(\mu)$, hence $ER(\mu) \le ER(m)$. □

By Theorem 4.2, if there exists the most typical model of S then it is the 
most trustworthy one among all models of S. Otherwise there is a typical model 
with a minimum value of erratum among all models of S. 
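Definitions 4.1 to 4.3 can be made concrete for a small propositional system. The sketch below is an illustration only: the one-clause system S = {a ∨ b} is a hypothetical example chosen here, not taken from the source, and the function names are assumptions. It computes the typical atoms, selects the typical models by the subset test of Definition 4.2, and evaluates the erratum (12).

```python
from fractions import Fraction
from itertools import product


def models(clauses, atoms):
    """Brute-force MOD(S); a literal is 'x' or '~x', a model is a dict atom -> bool."""
    found = []
    for values in product([True, False], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(m[l.lstrip("~")] != l.startswith("~") for l in c) for c in clauses):
            found.append(m)
    return found


def typical_values(mods, atoms):
    """Truth value of each typical atom: True when E(S,a) >= 0.5, else False."""
    return {a: 2 * sum(m[a] for m in mods) >= len(mods) for a in atoms}


def typical_part(m, typ):
    """T(m): the atoms on which model m agrees with the typical atoms."""
    return {a for a, value in typ.items() if m[a] == value}


def typical_models(mods, typ):
    """Models m with no model m' such that T(m) is a proper subset of T(m')."""
    return [m for m in mods
            if not any(typical_part(m, typ) < typical_part(o, typ) for o in mods)]


def erratum(m, mods, atoms):
    """ER(m) = 1 - average evidence of the literals of m, expression (12)."""
    def ev(a):  # evidence of the literal that m assigns to atom a
        return Fraction(sum(o[a] == m[a] for o in mods), len(mods))
    return 1 - sum(ev(a) for a in atoms) / len(atoms)


ATOMS = ["a", "b"]
MODS = models([{"a", "b"}], ATOMS)          # hypothetical S = {a v b}
TYP = typical_values(MODS, ATOMS)
TYPICAL = typical_models(MODS, TYP)
ERRATA = [erratum(m, MODS, ATOMS) for m in MODS]
```

Here the single typical model {a, b} contains all typical atoms, so it is the most typical model, and its erratum is the smallest among the three models, as Theorem 4.2 (ii) states.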

Let $ER(mtm)$, $ER(rand)$, $ER(worst)$, $E(S)$ denote respectively the erratum of the most typical model, the expected erratum of a randomly chosen model, the erratum of a model containing no typical atoms, and the average evidence of a typical atom of S. Then

$$E(S) = \frac{1}{|Base(S)|} \sum_{a \in Base(S)} E(S,\hat{a}), \qquad ER(mtm) = 1 - E(S), \tag{14}$$

$$ER(rand) = \frac{2}{|Base(S)|} \sum_{a \in Base(S)} E(S,\hat{a})\bigl(1 - E(S,\hat{a})\bigr), \qquad ER(worst) = E(S), \tag{15}$$

such that

$$\lim_{E(S) \to 1} \frac{ER(rand)}{ER(mtm)} = 2, \qquad \lim_{E(S) \to 1} \frac{ER(worst)}{ER(mtm)} = \infty. \tag{16}$$

Since the most typical model of a given system would be the most trustworthy one, it should be preferred by any rational reasoner. So the existence of a most typical model is a practically important characteristic of any knowledge system. Let $p(mtm)$ denote the probability that a given system S has the most typical model. The probability that a randomly chosen model of S contains all typical atoms is $\prod_{a \in Base(S)} E(S,\hat{a})$. Then

$$p(mtm) = 1 - \Bigl(1 - \prod_{a \in Base(S)} E(S,\hat{a})\Bigr)^{M}, \tag{17}$$

where $M = |MOD(S)|$, $B = |Base(S)|$, and $2^{-B} \le \prod_{a \in Base(S)} E(S,\hat{a}) \le (E(S))^{B}$. So

$$1 - (1 - 2^{-B})^{M} \le p(mtm) \le 1 - \bigl(1 - (E(S))^{B}\bigr)^{M}. \tag{18}$$

Expression (18) provides rather rough bounds for $p(mtm)$. Experimental estimation of $p(mtm)$ is presented in Section 7.



4.3 Typical kernel 

Since a system S may be inconsistent with the set $T(S)$ of all its typical atoms, it may have no most typical model. But S must have a typical model containing a subset of $T(S)$ consistent with S. It would be helpful to characterize a subset of typical atoms of any system S that is necessarily consistent with S regardless of beliefs assigned to other atoms of S. If for a given system this subset is non-empty then queries about atoms of the subset can be answered obliviously in any order.

Definition 4.4 (i) Considering any model as a set of literals, call two models $m'$, $m''$ a-neighbors if they differ only in the value of an atom $a \in Base(S)$ such that $a \in m'$, $\neg a \in m''$, and $m' - \{a\} = m'' - \{\neg a\}$. Let $MN(S \cup \{a\})$, $MN(S \cup \{\neg a\})$ denote sets of all a-neighboring models of S such that every model of $MN(S \cup \{a\})$ contains a, every one of $MN(S \cup \{\neg a\})$ contains $\neg a$, and to every model of $MN(S \cup \{a\})$ corresponds exactly one a-neighbor in $MN(S \cup \{\neg a\})$, and vice versa.

(ii) If for a typical atom a every model of S containing $\neg a$ has an a-neighbor in $MN(S \cup \{a\})$, that is

$$MOD(S \cup \{\neg a\}) = MN(S \cup \{\neg a\}), \tag{19}$$

then call a a kernel atom possessing the kernel property (19), and let the typical kernel of S, $tk(S)$, be the set of all kernel atoms of S. Figure 1 illustrates the kernel property. □
Figure 1: The kernel property of a. (Diagram regions: $MOD(S)$, $MOD(S \cup \{a\})$, $MOD(S \cup \{\neg a\})$, $MN(S \cup \{a\})$, $MN(S \cup \{\neg a\})$.)



By the kernel property, $tk(S)$ includes all atoms b such that $S \models b$, since then $MOD(S \cup \{\neg b\}) = \emptyset$.

Let us say that a formula $\phi$ cancels all models of S in which $\phi$ is false.

Lemma 4.1 For all kernel atoms a of S and all literals $l \neq \neg a$, if l is consistent with S then l is so with $S \cup \{a\}$.

Proof. Suppose l is consistent with S, but inconsistent with $S \cup \{a\}$, and so cancels all models of $MOD(S \cup \{a\})$ including $MN(S \cup \{a\})$. Since $l \neq \neg a$, l cancels all models of $MN(S \cup \{\neg a\})$ as well. By the kernel property of a, $MN(S \cup \{\neg a\}) = MOD(S \cup \{\neg a\})$. So l cancels all models of $MOD(S)$ and becomes inconsistent with S, which is a contradiction. □

Theorem 4.3 For all S, $tk(S)$ is consistent with S. There is no superset of $tk(S)$ possessing this property.

Proof. By induction on the serial number of kernel atoms of S numbered arbitrarily in $tk(S) = \{a_1, \ldots, a_k\}$.

Base. Include $a_1$ into S producing $S_1 = S \cup \{a_1\}$; $a_1$ is consistent with S, so $S_1$ is consistent; but $a_1$ cancels all models of S containing $\neg a_1$ such that

$$MOD(S_1) = MOD(S) - MOD(S \cup \{\neg a_1\}) = MOD(S \cup \{a_1\}) \neq \emptyset,$$
$$MOD(S_1 \cup \{a_1\}) = MOD(S_1), \qquad MOD(S_1 \cup \{\neg a_1\}) = \emptyset.$$

It turns out that for $1 < i \le k$ every $a_i \in tk(S)$ is a kernel atom of $S_1$. Indeed,

(i) by Lemma 4.1 $a_i$ is consistent with $S_1$ since it is consistent with S;

(ii)
$$MN(S_1 \cup \{\neg a_i\}) = MN(S \cup \{\neg a_i\}) - MOD(S \cup \{\neg a_1\});$$

(iii) since $a_i$ is a kernel atom of S,
$$MN(S \cup \{\neg a_i\}) = MOD(S \cup \{\neg a_i\}).$$

So (i) - (iii) imply

$$MN(S_1 \cup \{\neg a_i\}) = MOD(S \cup \{\neg a_i\}) - MOD(S \cup \{\neg a_1\}) = MOD(S_1 \cup \{\neg a_i\}).$$

Hence $a_i$ has the kernel property in $S_1$.

Step. Suppose kernel atoms $a_1, \ldots, a_i$ ($1 \le i < k$) have been included in S such that $S_i = S \cup \{a_1, \ldots, a_i\}$. Then by the same argument as above $S_i$ is consistent, and for all $i < j \le k$ we have $a_j \in tk(S_i)$. Hence $S_k = S \cup tk(S)$ is consistent.

So all kernel atoms of S can be included into S in any order preserving consistency of the augmented set. However, this may not be true regarding a non-kernel typical atom b of S such that $b \notin tk(S)$. Since b does not possess the kernel property, $MN(S \cup \{\neg b\}) \subset MOD(S \cup \{\neg b\})$. So unlike the situation described by Lemma 4.1, inclusion into S of $tk(S)$ (or even of any literal l consistent with S) may cancel all models of $MOD(S \cup \{b\})$ and of $MN(S \cup \{\neg b\})$. Since the latter set is just a proper subset of $MOD(S \cup \{\neg b\})$, we get $MOD(S \cup tk(S)) = MOD(S \cup \{\neg b\}) - MN(S \cup \{\neg b\}) \neq \emptyset$. Hence b is false in all models of $MOD(S \cup tk(S))$ and so inconsistent with $S \cup tk(S)$ (or with $S \cup \{l\}$, respectively). Thus the typical kernel is the largest set of atoms necessarily consistent with any S. □

The following algorithm checks for S presented in a propositional CNF whether a typical atom a is its kernel atom.

Algorithm 4.1 (Clauses $c \in S$ are sets of literals; a is a typical atom of S).

1. Count $N_1 = |MOD(S \cup \{\neg a\})|$;

2. Compute $S_1 = \{c - \{\neg a\} \mid c \in S \wedge a \notin c\}$;

3. Compute $S_2 = \{c - \{a\} \mid c \in S \wedge \neg a \notin c\}$;

4. Count $N_2 = |MOD(S_1 \cup S_2)|$;

5. If $N_1 = N_2$ return "Yes, a is a kernel atom of S" else return "No". □

There is a bijection between $MOD(S_1)$ and $MOD(S \cup \{a\})$, and between $MOD(S_2)$ and $MOD(S \cup \{\neg a\})$ such that to every model $m' \in MOD(S_1)$ corresponds a model $(m' \cup \{a\}) \in MOD(S \cup \{a\})$ and to every model $m'' \in MOD(S_2)$ corresponds a model $(m'' \cup \{\neg a\}) \in MOD(S \cup \{\neg a\})$, and vice versa. Since $MOD(S_1 \cup S_2) = MOD(S_1) \cap MOD(S_2)$, to every model $m \in MOD(S_1 \cup S_2)$ corresponds $(m \cup \{a\}) \in MN(S \cup \{a\})$ and $(m \cup \{\neg a\}) \in MN(S \cup \{\neg a\})$, and vice versa. Hence, $N_2 = |MN(S \cup \{a\})| = |MN(S \cup \{\neg a\})|$. So line 5 of the algorithm verifies whether a possesses the kernel property.
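The five steps of Algorithm 4.1 can be sketched directly, replacing the model counter by exhaustive enumeration, which is an assumption suitable only for small propositional systems. Names such as `is_kernel_atom` and the `"~"` encoding of negation are illustrative, and the one-clause system {a ∨ b} in the usage lines is a hypothetical example. The models of $S_1 \cup S_2$ are counted over the atoms other than a, in line with the bijection described above.

```python
from itertools import product


def neg(literal):
    """Complement of a literal written as 'x' or '~x'."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def count_models(clauses, atoms):
    """Number of total assignments over `atoms` satisfying every clause."""
    count = 0
    for values in product([True, False], repeat=len(atoms)):
        m = dict(zip(atoms, values))
        if all(any(m[l.lstrip("~")] != l.startswith("~") for l in c) for c in clauses):
            count += 1
    return count


def is_kernel_atom(S, atoms, a):
    """Algorithm 4.1 for a typical atom given as a literal `a` of S (list of sets)."""
    n1 = count_models(S + [{neg(a)}], atoms)            # step 1
    s1 = [c - {neg(a)} for c in S if a not in c]        # step 2: S with a set true
    s2 = [c - {a} for c in S if neg(a) not in c]        # step 3: S with a set false
    rest = [x for x in atoms if x != a.lstrip("~")]
    n2 = count_models(s1 + s2, rest)                    # step 4
    return n1 == n2                                     # step 5


# Hypothetical one-clause system: both atoms turn out to be kernel atoms.
small = [{"a", "b"}]
# System of Example 3.1: the typical atom a is not a kernel atom.
ex31 = [{"a", "b"}, {"b", "c"}, {"c", "a"}, {"~a", "~b", "~c"}]

kernel_small = is_kernel_atom(small, ["a", "b"], "a")
kernel_ex31 = is_kernel_atom(ex31, ["a", "b", "c"], "a")
```

For the system of Example 3.1 the single model containing ¬a has no a-neighbor, so $N_1 = 1$ while $N_2 = 0$.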

Theorem 4.4 For all S, every typical model of S includes $tk(S)$.

Proof. Suppose a typical model m of S does not include $tk(S)$ as it contains a negation $\neg a$ of a kernel typical atom $a \in tk(S)$. Due to the kernel property of a, S has a model $\mu$ that is the a-neighbor of m and hence contains a. So $T(m) \subset T(\mu)$. Hence, m is not a typical model, which is a contradiction. By the same argument every typical model of S includes $tk(S)$. □

4.4 Stable beliefs



People are in constant quest for knowledge. The available knowledge about the 
real world is being expanded and deepened. If new knowledge is added to S, 
the set of models of S changes, and so the set of possible worlds W changes as 
well. Indeed, the new knowledge changes the image of the reality portrayed by 
S for its users. The corresponding changes take place in sets of beliefs derived 
from S by its users. Some beliefs regarding formulas incomplete in S become 
more certain, but others turn out to be false. 

This phenomenon makes reasoning with incomplete information nonmonotonic: while S grows, the set of beliefs and conclusions derived from S may shrink. The possibility that some beliefs may become false is rather embarrassing and harmful. If a reasoner uses the semantics of typical models, this minimizes the expected number of beliefs that may be false in the present state of the world. Yet the reasoner would be interested to know more: which, if any, of his or her beliefs are stable in the sense that they remain credible under some additions to the system? The set of stable beliefs would possess a property of relative monotonicity with respect to these additions.

The kernel property provides the following nice quality of stability of beliefs 
concerning kernel atoms. 

Theorem 4.5 For all S, all $a \in tk(S)$, and any formula $\phi$ that is consistent with S and does not contain a in its base, a is a typical kernel atom of $S' = S \cup \{\phi\}$. So addition of $\phi$ to S does not require changing the belief in a derived from S due to the semantics of typical models.

Proof. Since the value of $\phi$ does not depend on an assignment to $a \in tk(S)$, if $\phi$ cancels a model m of S containing $\neg a$ then it cancels the a-neighbor of m containing a, so still $MOD(S' \cup \{\neg a\}) = MN(S' \cup \{\neg a\})$. Hence, a retains its kernel property in $S'$. So beliefs in a derived from S and $S'$ are identical. □

Corollary 4.2 Let $tk(S) = \{a_1, \ldots, a_k\}$, and $Base(\phi)$ denote the base of a formula $\phi$. Then $tk(S)$ is monotonic with respect to a set of all formulas $\phi$ such that $Base(\phi) \cap \{a_1, \ldots, a_k\} = \emptyset$. □

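Example 4.1 $S = \{p \vee \neg q \vee r,\ s \vee v,\ \neg q \vee r \vee \neg s,\ \neg u \vee \neg s,\ \neg p \vee q \vee \neg v,\ s \vee \neg v,\ \neg q \vee r \vee \neg u,\ \neg p \vee u \vee v,\ q \vee v\}$.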

Table 1 presents data describing S: $MOD(S) = \{m_1, m_2, m_3, m_4, m_5\}$; the model $\{\neg p, q, r, s, \neg u, v\}$ is the most typical model of S containing its typical kernel $tk(S) = \{\neg p, r, s, v\}$ (compare $|MOD(S \cup \{\neg a\})|$ with $|MN(S \cup \{\neg a\})|$ for $a \in \{\neg p, q, r, s, \neg u, v\}$).

To illustrate stability of kernel atoms (Theorem 4.5) let us augment S with $\phi = \{\neg p \vee \neg q \vee \neg r\}$. The four bottom rows of Table 1 describe $S' = S \cup \{\phi\}$. $MOD(S') = \{m_1, m_2, m_3, m_4\}$ since $\phi$ cancels $m_5$. Although $\phi$ contains kernel atom $\neg p$ and even the negation of kernel atom r, all kernel atoms of S remain such in $S'$: $tk(S') = tk(S)$. So in certain cases the stability of kernel atoms extends beyond the limits determined by Theorem 4.5. □

Table 1: Typical kernels of S and S' (Example 4.1)

5 Typical atoms vs. intuition

Since beliefs in the truth of typical atoms are more likely to be true in the real 
world than the opposite ones, we may expect that these beliefs should correlate 
with conclusions suggested by human intuition based on life experience. These 
conclusions are supposed to correlate with the semantics of typical models better 
than with any other semantics preferring models different from typical ones. 
The rest of this section presents a rather simple example. 

Example 5.1 (A growing experience)

$$S_0 = Policeman(Alex) \wedge Criminal(Bob) \wedge (\forall x)\{(Policeman(x) \rightarrow \neg Criminal(x) \wedge \neg Dangerous(x)) \wedge (Criminal(x) \rightarrow \neg Helpful(x))\}. \tag{20}$$

Suppose that life experience keeps providing additional information $\Delta S$ characterizing policemen and criminals under certain conditions such that for $i > 0$

$$\Delta S_i = (\forall x)\{(Policeman(x) \wedge P\_Condition_i(x) \rightarrow Helpful(x)) \wedge (Criminal(x) \wedge C\_Condition_i(x) \rightarrow Dangerous(x))\}. \tag{21}$$

For instance,

$$\Delta S_1 = (\forall x)\{(Policeman(x) \wedge OnDuty(x) \rightarrow Helpful(x)) \wedge (Criminal(x) \wedge Armed(x) \rightarrow Dangerous(x))\}.$$

Let us ask two questions: "Is policeman Alex helpful?" and "Is criminal Bob dangerous?" So consider queries $F_1 = Helpful(Alex)$, $F_2 = Dangerous(Bob)$. □
A common-sense intuition suggests affirmative answers to both queries.

Denote $S_i = S_{i-1} \wedge \Delta S_i$, and let the domain D of all terms in $S_i$ be a finite set of names of individuals in the community under consideration. Then from expressions (20), (21) we get by induction on i

$$|MOD(S_i)| = (2^i + 1)^2 (4^{i+1} + 2^{i+1} + 2)^{|D|-2}$$

and

$$E(S_i, Helpful(Alex)) = E(S_i, Dangerous(Bob)) = 1 - \frac{1}{2^i + 1}.$$

Hence for all $i > 0$

$$0.5 < E(S_i, Helpful(Alex)) = E(S_i, Dangerous(Bob)) < 1,$$

$$\lim_{i \to \infty} E(S_i, Helpful(Alex)) = \lim_{i \to \infty} E(S_i, Dangerous(Bob)) = 1.$$

So for all $i > 0$, $Helpful(Alex)$ and $Dangerous(Bob)$ are typical atoms in $S_i$ suggesting beliefs in agreement with the common-sense intuition, and the larger i the better this agreement. By Corollary 4.1 (ii), $Helpful(Alex) \wedge Dangerous(Bob)$ is consistent with all $S_i$.
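The evidence values of Example 5.1 can be checked by brute force for small i. The sketch below assumes the two-person domain D = {Alex, Bob} and uses hypothetical predicate names `PCond1`, `CCond1`, ... for the i conditions. It enumerates interpretations over all ground atoms, so its raw model count need not coincide with the closed form above; only the evidence ratio is examined.

```python
from fractions import Fraction
from itertools import product


def example_5_1_evidence(i, people=("Alex", "Bob")):
    """Return (E(S_i, Helpful(Alex)), E(S_i, Dangerous(Bob))) by enumeration."""
    preds = ["Policeman", "Criminal", "Dangerous", "Helpful"]
    preds += [f"PCond{j}" for j in range(1, i + 1)]   # hypothetical condition names
    preds += [f"CCond{j}" for j in range(1, i + 1)]
    atoms = [(p, x) for x in people for p in preds]

    def is_model(w):
        # Facts of S_0.
        if not (w["Policeman", "Alex"] and w["Criminal", "Bob"]):
            return False
        for x in people:
            pol, crim = w["Policeman", x], w["Criminal", x]
            dang, helpful = w["Dangerous", x], w["Helpful", x]
            if pol and (crim or dang):        # Policeman -> ~Criminal & ~Dangerous
                return False
            if crim and helpful:              # Criminal -> ~Helpful
                return False
            for j in range(1, i + 1):         # rules Delta S_1 .. Delta S_i
                if pol and w[f"PCond{j}", x] and not helpful:
                    return False
                if crim and w[f"CCond{j}", x] and not dang:
                    return False
        return True

    total = helpful_alex = dangerous_bob = 0
    for values in product([False, True], repeat=len(atoms)):
        w = dict(zip(atoms, values))
        if is_model(w):
            total += 1
            helpful_alex += w["Helpful", "Alex"]
            dangerous_bob += w["Dangerous", "Bob"]
    return Fraction(helpful_alex, total), Fraction(dangerous_bob, total)


e1 = example_5_1_evidence(1)
e2 = example_5_1_evidence(2)
```

For i = 1 and i = 2 the enumeration agrees with $1 - 1/(2^i + 1)$, giving 2/3 and 4/5 respectively.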

Notably, any approach preferring a minimal model yields counter-intuitive beliefs in this example. Indeed, by definition, a model m is a minimal model of S if there is no model $\mu$ of S such that the set of unnegated atoms of $\mu$ is a proper subset of the set of unnegated atoms of m. For all $i > 0$, $S_i$ has a single minimal model in which all atoms except $Policeman(Alex)$ and $Criminal(Bob)$ are negated, suggesting that under all circumstances Alex is not helpful and Bob is not dangerous. Such beliefs are hardly reasonable.



6 Computing evidence 



Recently several algorithms have been developed for counting models (Bayardo et al. 2000; Birnbaum et al. 1999; Gomes et al. 2006; Lozinskii 1992; Morgado et al. 2006; Sang et al. 2005; Thurley 2006; Wei et al. 2005) that can be employed for computing evidence. The following algorithm (based on the algorithm CDP (Birnbaum et al. 1999)) has been used in this work for computing evidence of propositional formulas.

Algorithm 6.1 Given $S$, let $V = \{v_1, v_2, \ldots, v_n\}$ be the set of all propositional variables of $S$.

1. Apply to $S$ the Davis-Putnam-Logemann-Loveland procedure (Davis et al. 1962). Let $P^{(k)} = \{l_1, \ldots, l_k\}$ represent a sequence of truth assignments to literals on a path from the root of the search tree to a node. If $P^{(k)}$ satisfies $S$, but $\{l_1, \ldots, l_{k-1}\}$ does not, call $P^{(k)}$ a satisfying path. Let a full assignment be an assignment to all variables of $S$. Any full assignment containing a satisfying path is a model of $S$.






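2. Any satisfying path $P^{(k)}$ contributes $2^{n-k}$ models to $\mathit{MOD}(S)$, $2^{n-k}$ models to $\mathit{MOD}(S \cup \{l\})$ for every literal $l \in P^{(k)}$, and $2^{n-k-1}$ models to $\mathit{MOD}(S \cup \{l'\})$ for every literal $l'$ such that $l' \notin P^{(k)}$ and $\neg l' \notin P^{(k)}$.

3. Let $\mathcal{P}$ denote the set of all satisfying paths of $S$. Then

$$|\mathit{MOD}(S)| = \sum_{P^{(k)} \in \mathcal{P}} 2^{n-k}$$

and for all literals $l$

$$|\mathit{MOD}(S \cup \{l\})| = \sum_{\substack{P^{(k)} \in \mathcal{P} \\ l \in P^{(k)}}} 2^{n-k} + \sum_{\substack{P^{(k)} \in \mathcal{P} \\ l \notin P^{(k)},\ \neg l \notin P^{(k)}}} 2^{n-k-1}.$$

4. For all $v \in V$, calculate $E(S, v) = |\mathit{MOD}(S \cup \{v\})| / |\mathit{MOD}(S)|$.

Observation 6.1 For all literals $l$: $E(S, l) = 1$ iff $l \in P$ for all $P \in \mathcal{P}$; $E(S, l) = 0.5$ iff $l \notin P$ and $\neg l \notin P$ for all $P \in \mathcal{P}$; $l$ is a typical atom if $\neg l \notin P$ for all $P \in \mathcal{P}$.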
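
The steps of Algorithm 6.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: clauses are lists of non-zero integers in the DIMACS convention (`-v` is the negation of variable `v`), no clause is empty, the search is a plain splitting procedure without unit propagation or any branching heuristic, and the name `evidence_by_paths` is hypothetical.

```python
from fractions import Fraction


def evidence_by_paths(clauses, n):
    """Count models of a CNF over variables 1..n via satisfying paths.

    Returns (|MOD(S)|, {v: E(S, v)}); the dictionary is empty if S is unsatisfiable.
    """
    total = 0
    with_lit = {v: 0 for v in range(1, n + 1)}  # |MOD(S u {v})| per variable

    def assign(cls, lit):
        """Simplify the clause list under lit; None signals a falsified clause."""
        out = []
        for clause in cls:
            if lit in clause:
                continue  # clause already satisfied
            reduced = [x for x in clause if x != -lit]
            if not reduced:
                return None
            out.append(reduced)
        return out

    def search(cls, path):
        nonlocal total
        if not cls:  # every clause satisfied: path is a satisfying path P^(k)
            k = len(path)
            total += 2 ** (n - k)
            for v in with_lit:
                if v in path:
                    with_lit[v] += 2 ** (n - k)
                elif -v not in path:
                    with_lit[v] += 2 ** (n - k - 1)
            return
        v = abs(cls[0][0])  # split on a variable of the first open clause
        for lit in (v, -v):
            reduced = assign(cls, lit)
            if reduced is not None:
                search(reduced, path | {lit})

    search([list(c) for c in clauses], frozenset())
    if total == 0:
        return 0, {}
    return total, {v: Fraction(c, total) for v, c in with_lit.items()}
```

For example, for the clauses `[[1, 2], [-1, 3]]` over three variables the search finds the satisfying paths {1, 3} and {-1, 2}, each contributing 2 models, so `evidence_by_paths([[1, 2], [-1, 3]], 3)` yields 4 models with evidence 1/2 for variable 1 and 3/4 for variables 2 and 3.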



Counting models is a hard computational task: it is a #P-complete problem (Valiant 1979). At the present state of the art, counting the models of $S$ requires time exponential in the size of $S$. This fact puts many knowledge collections well beyond the computational power of existing computers. One way to overcome this complexity problem is to resort to approximation. Appendix A briefly presents two methods of computing a fast approximation of evidence.



7 Experiments 

Non-oblivious reasoning preserves consistency of a set of beliefs. However, this 
important feature is achieved at the expense of efficiency. Since it is necessary 
to take into account all beliefs produced previously, non-oblivious reasoning is 
harder computationally than the corresponding oblivious one. 

If a system S has a most typical model then any set of beliefs consisting of 
typical atoms is consistent with S. In this case beliefs regarding typical atoms 
can be produced obliviously which makes reasoning with the most typical model 
efficient. 

Consider a propositional formula $S$ in CNF as a set of $C$ clauses over $B$ propositional variables, and let $r = C/B$ denote the clauses-to-variables ratio.

To gather information regarding existence of most typical models we have 
run experiments with a program that generates random sets of propositional 
clauses and measures their parameters relevant to this study. 

Let $p(mtm)$ be the probability that a system $S$ has a most typical model. The closer $p(mtm)$ is to 1, the lower the probability of inconsistency caused by oblivious reasoning with typical atoms of $S$. Figure 2 displays $p(mtm)$ and $ER(mtm)$ of a set of clauses as functions of $r$ (averaged over 10000 random sets with $B = 30, 100$).

[Plot: p(mtm) and ER(mtm) versus r]

Figure 2: Probability and erratum of a most typical model as a function of r.



Models of any consistent set of clauses $S$ are arranged in clusters, each determined by a satisfying path $P^{(k)}$ and so containing $N = 2^{B-k}$ models that have $k$ literals in common. In such a cluster the evidence of all $k = B - \log_2 N$ common literals is 1, and that of each of the remaining $\log_2 N$ literals is 0.5. Hence the average evidence of an atom in a cluster is $1 - (\log_2 N)/(2B)$. So for a system $S$ with $M$ models, $1 - (\log_2 M)/(2B)$ can be taken as an approximation of $E(S)$. If $1 - (\log_2 M)/(2B)$ is substituted for $E(S)$ in expression (18), then the right-hand side of (18) has a minimum at a number of models $M_0$ determined by the equation

$$(1-\phi)\ln(1-\phi) + \frac{1}{2\ln 2}\,\phi^{1-1/B} = 0 \tag{22}$$

where $\phi = (1 - (\log_2 M_0)/(2B))^B$. Since the number of models of $S$ is a monotone decreasing function of $r$, there is a value $r_0$ corresponding to $M_0$ at which $p(mtm)$ has a minimum, as shown in Figure 2. It is worth noting that the erratum of a most typical model decreases with growing $r$. This agrees with the common-sense intuition that the more information a system contains, the more correct conclusions can be derived.

The clauses-to-variables ratio $r$ of $S$ can be calculated in time linear in the size of $S$, so the value of $r$ is a convenient measure for estimating $p(mtm)$. There is another syntactic (and so easily computable) measure of $S$ that controls features of $S$ in a way similar to that of $r$. This is impurity, studied in Lozinskii (2006).

Let $pos(v)$, $neg(v)$ stand, respectively, for the number of unnegated and negated occurrences of a variable $v$ in a set of clauses $S$. If $v$ occurs in $S$ either only unnegated or only negated ($neg(v) = 0$ or $pos(v) = 0$) then $v$ is a pure variable in $S$, otherwise $v$ is an impure one. Denote

$$max(v) = \max(pos(v), neg(v)), \quad min(v) = \min(pos(v), neg(v)). \tag{23}$$

Let $imp(v) = min(v) / max(v)$ be called the impurity of $v$, and $imp(S)$ stand for the impurity of $S$, that is, the average impurity of its variables:

$$imp(S) = \frac{1}{B} \sum_{i=1}^{B} min(v_i) / max(v_i) \tag{24}$$

$$0 \le imp(S) \le 1. \tag{25}$$
It has been shown in Lozinskii (2006) that while the impurity of a set of clauses $S$ grows from 0 to 1, the probability that $S$ is satisfiable decreases and undergoes a phase transition in the vicinity of a certain value of impurity depending on $r$. The number of models of $S$ is a monotone decreasing function of $imp(S)$, just as it is of $r$. Figure 3 presents $p(mtm)$ and $ER(mtm)$ of a set of clauses as functions of its impurity (averaged over 10000 random sets with $B = 30, 100$, $r = 4.26$, and $0 \le imp(S) \le 0.92$). The patterns are similar to those of Figure 2. So given $S$, both $r(S)$ and $imp(S)$ can be used for a quick estimation of the probability that $S$ has a most typical model.
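
Both measures are simple to compute in one pass over the clauses. The Python sketch below is an illustration under assumptions of its own: clauses are lists of non-zero integers (`-v` denotes the negation of variable `v`), B is taken to be the number of variables that actually occur in S, and the function name `ratio_and_impurity` is hypothetical.

```python
from collections import Counter


def ratio_and_impurity(clauses):
    """Return (r, imp(S)) for a non-empty CNF given as lists of signed integers."""
    pos, neg = Counter(), Counter()
    for clause in clauses:
        for lit in clause:
            # Count unnegated and negated occurrences of each variable.
            (pos if lit > 0 else neg)[abs(lit)] += 1
    variables = set(pos) | set(neg)
    b = len(variables)
    r = len(clauses) / b
    # Impurity of a variable is min(pos, neg) / max(pos, neg); pure variables give 0.
    imp = sum(min(pos[v], neg[v]) / max(pos[v], neg[v]) for v in variables) / b
    return r, imp
```

For the clauses `[[1, 2], [-1, 3], [-2, -3], [1, -3]]` the variable impurities are 1/2, 1 and 1/2, so the sketch returns r = 4/3 and imp(S) = 2/3.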



8 Conclusion 

In general, a knowledge system S describing a real world does not contain 
complete information about it. Reasoning with incomplete information is prone 
to errors since any belief derived from S may turn out to be false in the present 
state of the world. The smaller the expected number of false beliefs produced 
by an approach to reasoning with incomplete information, the more reliable the 
approach. 

In regard to the main goal, choosing a model that would represent the reality most faithfully, this work is close to the previous research on reasoning with incomplete information, but presents a completely different approach: it introduces typical models and shows that any knowledge system has a typical model that is the most trustworthy one since it minimizes the expected number of false beliefs. So if minimization of reasoning errors is important, the semantics of typical models is the best one among all approaches to reasoning with incomplete information.

[Plot: p(mtm) versus imp]

Figure 3: Probability and erratum of a most typical model as a function of imp.



We consider oblivious and non-oblivious reasoning. The latter unlike the 
former is safe in the sense that it does not cause inconsistency of the set of 
beliefs with S. However, oblivious reasoning is more efficient computationally 
than the corresponding non-oblivious one. 

Under the following conditions oblivious reasoning with typical atoms is 
safe, and the beliefs do not depend on the order in which they were produced: 

(i) If S has a most typical model then oblivious reasoning with all typical 
atoms of S is safe; 

(ii) Oblivious reasoning with all atoms of the typical kernel of S is safe; 

(iii) The higher the probability p{mtm) that S has a most typical model, 
the smaller the probability that oblivious reasoning with typical atoms of S is 
not safe.

Appendix A. Approximation of evidence

Reasoning with typical models involves counting models. This is a #P-complete problem (Valiant 1979), a highly complex computational task that for large logic systems is beyond the power of existing computers. One practical way to ease this difficulty is to use approximation.



A1. Credible subsets

Given a system S and a query F, should it be possible to find a subset of S 
informative enough to provide a correct answer to F with a high probability 
and small enough to fit into the range of the available computing resources, the 
answer to F could be produced efficiently. This approach has been studied in Lozinskii (1997).



Definition 8.1 Let $L^{(1)}$ denote a subset of $S$ consisting of all clauses of $S$ containing a literal $L$ or $\neg L$. Call $L^{(1)}$ the first surrounding of $L$. For $i > 1$ let $L^{(i)}$ denote the $i$-th surrounding of $L$, that is, the set of all clauses of $S$ which either belong to $L^{(i-1)}$ or share a common variable with a clause of $L^{(i-1)}$. □

An $i$-th surrounding of $L$ provides an evidence $E(L^{(i)}, L)$ of $L$ that can be considered as an approximation of $E(S, L)$ with the approximation error $\epsilon^{(i)}$ such that $\epsilon^{(i)} = E(L^{(i)}, L) - E(S, L)$. A belief in $L$ suggested by $E(L^{(i)}, L)$ is credible if it is the same as that provided by $E(S, L)$. As reported in Lozinskii (1997), while $i$ increases, the value of $|\epsilon^{(i)}|$ decreases, and the probability that a belief suggested by $E(L^{(i)}, L)$ is credible approaches 1. For most instances tested in Lozinskii (1997) the first surrounding provided credible beliefs with a
high probability, while the corresponding run time was about 10^ times shorter 
than that required for processing of the full S. The credibility of approximation 
increases with the second and further surroundings along with a decrease of the 
run time gain. 
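
The construction of surroundings in Definition 8.1 can be sketched as follows. This Python fragment is an illustration only: it assumes clauses given as lists of non-zero integers (`-v` for a negated variable), identifies the literal L by its variable, and returns clause indices; the name `surrounding` is hypothetical.

```python
def surrounding(clauses, var, i):
    """Indices of the clauses forming the i-th surrounding (i >= 1) of variable var."""
    # First surrounding: all clauses containing the literal or its negation.
    current = {j for j, c in enumerate(clauses) if any(abs(l) == var for l in c)}
    for _ in range(i - 1):
        # Variables occurring in the surrounding built so far.
        seen = {abs(l) for j in current for l in clauses[j]}
        # Keep the old clauses and add every clause sharing a variable with them.
        current = current | {
            j for j, c in enumerate(clauses) if any(abs(l) in seen for l in c)
        }
    return sorted(current)
```

For the clauses `[[1, 2], [-2, 3], [3, 4], [5, 6]]` and variable 1, the first, second and third surroundings are the clause index sets [0], [0, 1] and [0, 1, 2]; the last clause shares no variable with the others and is never reached. The evidence of L in a surrounding can then be computed on this smaller clause set by any model counter.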



A2. Comparing bounds 

Algorithm 6.1 can be used for computing upper and lower bounds of the size 
of sets of models. 

If a path $P^{(k)} = \{l_1, \ldots, l_k\}$ falsifies $S$ but $\{l_1, \ldots, l_{k-1}\}$ does not, call $P^{(k)}$ a falsifying path. Any full assignment containing a falsifying path is a non-model of $S$. Any falsifying path $P^{(k)}$ contributes $2^{n-k}$ non-models to the set of non-models of $S$ containing a literal $l$ for every literal $l \in P^{(k)}$, and $2^{n-k-1}$ non-models to the set of non-models of $S$ containing $l'$ for every literal $l'$ such that $l' \notin P^{(k)}$ and $\neg l' \notin P^{(k)}$.

Consider a run of Algorithm 6.1 starting at time $\tau_s$ and finishing at $\tau_f$. In the course of its run the algorithm discovers more and more satisfying and falsifying paths, and accumulates models and non-models. Let $\mathcal{M}_t(l)$, $\mathcal{N}_t(l)$ denote the number of models and non-models containing a literal $l$ counted between time $\tau_s$ and $t$. Since $\mathcal{M}_t(l)$ and $\mathcal{N}_t(l)$ are non-decreasing functions of $t$, this determines the following bounds of the number of models of $S$:

$$\mathcal{M}_t(l) \le |\mathit{MOD}(S \cup \{l\})| \le 2^{n-1} - \mathcal{N}_t(l);$$

$$\mathcal{M}_t(\neg l) \le |\mathit{MOD}(S \cup \{\neg l\})| \le 2^{n-1} - \mathcal{N}_t(\neg l).$$

If for an atom $a$ at time $\tau(a) \le \tau_f$

$$\mathcal{M}_{\tau(a)}(a) > 2^{n-1} - \mathcal{N}_{\tau(a)}(\neg a) \quad \text{or} \quad \mathcal{M}_{\tau(a)}(\neg a) > 2^{n-1} - \mathcal{N}_{\tau(a)}(a) \tag{26}$$

then $|\mathit{MOD}(S \cup \{a\})| > |\mathit{MOD}(S \cup \{\neg a\})|$, $E(S, a) > 0.5$, and hence the typical atom $\hat{a} = a$ or, respectively, $|\mathit{MOD}(S \cup \{\neg a\})| > |\mathit{MOD}(S \cup \{a\})|$, $E(S, a) < 0.5$, and $\hat{a} = \neg a$. So the typical value $\hat{a}$ can be determined already at time $\tau(a)$. At this time the bounds give the following approximation of evidence:

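$$\frac{\mathcal{M}_{\tau(a)}(a)}{\mathcal{M}_{\tau(a)}(a) + 2^{n-1} - \mathcal{N}_{\tau(a)}(\neg a)} \le E(S, a) \le 1 - \frac{\mathcal{M}_{\tau(a)}(\neg a)}{\mathcal{M}_{\tau(a)}(\neg a) + 2^{n-1} - \mathcal{N}_{\tau(a)}(a)}. \tag{27}$$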

Let $\tau_0(a)$ denote the earliest time at which one of the inequalities (26) holds. It can be shown that for all $a \in \mathit{Base}(S)$, if $|E(S, a) - 0.5| > 0$ then $\tau_0(a) < \tau_f$, and the larger the value of $|E(S, a) - 0.5|$, the larger the run time gain $(\tau_f - \tau_s)/(\tau_0(a) - \tau_s) > 1$. So an estimation of evidence and determination of the corresponding typical atom can be achieved by means of comparing bounds faster than by a full run of Algorithm 6.1.
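
The comparison of bounds can be sketched in Python as follows. The fragment is an illustration under assumptions: the four counters are the numbers of models and non-models containing a and ¬a accumulated so far, the test is inequality (26), and the evidence interval takes the lower bound as M(a) / (M(a) + 2^(n-1) - N(¬a)) and the upper bound as 1 - M(¬a) / (M(¬a) + 2^(n-1) - N(a)), which is how (27) is read here. The name `compare_bounds` and its parameters are hypothetical.

```python
from fractions import Fraction


def compare_bounds(n, m_pos, m_neg, nm_pos, nm_neg):
    """Decide the typical literal of an atom a from partial counts, if possible.

    n      -- number of propositional variables of S
    m_pos  -- models containing a counted so far
    m_neg  -- models containing (not a) counted so far
    nm_pos -- non-models containing a counted so far
    nm_neg -- non-models containing (not a) counted so far
    Returns (typical, low, high) with low <= E(S, a) <= high.
    """
    half = 2 ** (n - 1)        # number of full assignments containing a fixed literal
    upper_pos = half - nm_pos  # upper bound on |MOD(S u {a})|
    upper_neg = half - nm_neg  # upper bound on |MOD(S u {not a})|

    if m_pos > upper_neg:      # first inequality of (26)
        typical = "a"
    elif m_neg > upper_pos:    # second inequality of (26)
        typical = "not a"
    else:
        typical = None         # not decided yet, keep searching

    den_low = m_pos + upper_neg
    den_high = m_neg + upper_pos
    low = Fraction(m_pos, den_low) if den_low else Fraction(0)
    high = 1 - Fraction(m_neg, den_high) if den_high else Fraction(1)
    return typical, low, high
```

For instance, with n = 3, three models containing a and two non-models containing ¬a already counted, `compare_bounds(3, 3, 0, 0, 2)` decides in favour of a and brackets the evidence between 3/5 and 1, without waiting for the search to finish.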

Appendix B. Relaxing limitations

So far we have assumed that all possible worlds represented by the models of $S$ are equiprobable and the sets $W$ and $\mathit{MOD}(S)$ are finite. This appendix shows an example of how these limitations can be relaxed.

B1. Probability of possible worlds

In most practical cases there is no comprehensive statistical information about the world sufficient for calculating the probability $p(m)$ for every model $m \in \mathit{MOD}(S)$. However, there often is some restricted statistics regarding a subset of objects and events of the world. For instance, suppose the prior probabilities of certain possible worlds are known (as all the possible worlds are mutually exclusive, their mutual conditional probabilities are 0). Let $M$ be the set of models of $S$ representing possible worlds with known probability, and denote $p(M) = \sum_{m \in M} p(m)$. Then, assuming that all possible worlds with unknown probabilities are equiprobable, we get

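$$E(S, F) = \sum_{m \in M \cap \mathit{MOD}(S \cup \{F\})} p(m) + (1 - p(M)) \, \frac{|\mathit{MOD}(S \cup \{F\}) - M|}{|\mathit{MOD}(S) - M|}. \tag{28}$$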

If no prior probabilities of possible worlds are known, so that $M = \emptyset$, then expression (28) becomes identical to that of Definition 2.2. In another special case, if prior probabilities are given for all possible worlds, so that $M = \mathit{MOD}(S)$ and $p(M) = 1$, then the evidence $E(S, F)$ amounts to the probability $p(F)$.

B2. Infinite sets of models



More research has to be done to extend the notion of evidence to systems with 
infinite sets of models. Here is one possible approach. 

Since the set of predicate symbols occurring in a first-order system $S$ is finite, the reason for infiniteness of the set of its models is the infiniteness of the domain of its terms*. Let $D$ be an infinite enumerable domain of $S$, $d$ denote a finite subset of $D$, and $S^{(d)}$ stand for the original system $S$ for which the original domain $D$ is replaced with $d$. Then $S$ can be viewed as a limit of $S^{(d)}$ while $d$ approaches $D$. The set of models $\mathit{MOD}(S^{(d)})$ is finite, allowing the following definition.

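Definition 8.2 Given $S$ and its domain $D$, let $d_1, d_2, \ldots$ be a sequence of finite subsets of $D$ such that $\lim_{i \to \infty} d_i = D$. Then the evidence of a formula $F$ in $S$ is

$$E(S, F) = \lim_{i \to \infty} E(S^{(d_i)}, F) = \lim_{i \to \infty} \frac{|\mathit{MOD}(S^{(d_i)} \cup \{F\})|}{|\mathit{MOD}(S^{(d_i)})|} \tag{29}$$

if the latter limit exists. □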

Applicability of this definition depends on the nature of $S$, $D$ and $F$, and on a proper construction of the sequence of finite subsets of $D$ for computing the limit of $E(S^{(d_i)}, F)$.

Example 8.1 $S = (\forall x)\{(P(x) \rightarrow R(x)) \wedge (Q(x) \rightarrow R(a))\}$, and the domain of $x$ is the set of all natural numbers.

Let us define $d_i = \{1, \ldots, a+i\}$. Then in $S^{(d_i)}$ we have:

If $R(a)$ is false then $P(a)$ is false and for all $x \in d_i$ $Q(x)$ is false; for every value of $x \in (d_i - \{a\})$ the clause $P(x) \rightarrow R(x)$ has 3 satisfying assignments; so $|\mathit{MOD}(S^{(d_i)} \cup \{\neg R(a)\})| = 3^{a+i-1}$.

If $R(a)$ is true then 2 assignments satisfy $P(a) \rightarrow R(a)$ and each $Q(x) \rightarrow R(a)$ for all $x \in d_i$, and 3 assignments satisfy $P(x) \rightarrow R(x)$ for all $x \in (d_i - \{a\})$; so $|\mathit{MOD}(S^{(d_i)} \cup \{R(a)\})| = 2^{a+i+1} \, 3^{a+i-1}$.

Hence,

$$|\mathit{MOD}(S^{(d_i)})| = (2^{a+i+1} + 1) \, 3^{a+i-1}, \quad E(S^{(d_i)}, R(a)) = \frac{2^{a+i+1}}{2^{a+i+1} + 1}. \tag{30}$$

A similar calculation gives $E(S^{(d_i)}, P(a)) = 2^{a+i} / (2^{a+i+1} + 1)$;
for all $x \in (d_i - \{a\})$: $E(S^{(d_i)}, P(x)) = \frac{1}{3}$, $E(S^{(d_i)}, R(x)) = \frac{2}{3}$;
for all $x \in d_i$: $E(S^{(d_i)}, Q(x)) = 2^{a+i} / (2^{a+i+1} + 1)$.

In the limit $i \to \infty$ we get $E(S, P(a)) = \frac{1}{2}$, $E(S, R(a)) = 1$;
for all natural $x \ne a$: $E(S, P(x)) = \frac{1}{3}$, $E(S, R(x)) = \frac{2}{3}$;
for all natural $x$: $E(S, Q(x)) = \frac{1}{2}$. □
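
The finite-domain counts of this example can be checked by exhaustive enumeration for small a and i. The Python sketch below is an illustration only; it assumes the ground instance of S over the domain {1, ..., a+i}, with one propositional atom for each of P(x), Q(x), R(x), and the name `count_models_and_evidence` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def count_models_and_evidence(a, i):
    """Return (number of models, evidence of R(a)) over the domain {1, ..., a+i}."""
    n = a + i
    total = with_ra = 0
    for bits in product([False, True], repeat=3 * n):
        P, Q, R = bits[:n], bits[n:2 * n], bits[2 * n:]
        ra = R[a - 1]  # truth value of R(a); individuals are numbered from 1
        # Ground instances of P(x) -> R(x) and Q(x) -> R(a) for every x in the domain.
        if all((not P[x] or R[x]) and (not Q[x] or ra) for x in range(n)):
            total += 1
            with_ra += ra
    return total, Fraction(with_ra, total)
```

For a = 1, i = 1 the enumeration finds 27 models and evidence 8/9 for R(a), and for a = 2, i = 1 it finds 153 models and evidence 16/17, in agreement with the closed forms (2^(a+i+1) + 1) 3^(a+i-1) and 2^(a+i+1) / (2^(a+i+1) + 1).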



* In particular, the Herbrand domain of $S$ becomes infinite if $S$ contains function symbols or existential quantifiers producing Skolem functions.